Hierarchical Load Balancing for Parallel Fast Legendre Transforms
Authors
Abstract
We present a parallel Fast Legendre Transform (FLT) based on the Driscoll-Healy algorithm, with computational complexity O(N log^2 N). The parallel FLT is load-balanced in a hierarchical fashion. We use a load-balanced FFT to derive a load-balanced parallel fast cosine transform, which in turn serves as a building block for the Legendre transform engine, from which the parallel FLT is constructed. We demonstrate how the arithmetic, memory, and communication complexities of the parallel FLT are derived hierarchically from the complexities of its modular blocks.
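The first layer of the hierarchy described above, a cosine transform built on top of an FFT, can be illustrated with a small serial sketch. This is not the authors' parallel, load-balanced construction. It assumes the well-known even/odd reordering trick for computing a DCT-II from one same-length FFT, and the names `fft`, `dct2_via_fft`, and `dct2_direct` are hypothetical helpers introduced only for illustration.

```python
import cmath
import math


def fft(a):
    """Recursive radix-2 FFT (illustrative only); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2])
    odd = fft(a[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def dct2_via_fft(x):
    """Unnormalized DCT-II, X[k] = sum_m x[m] cos(pi (m + 1/2) k / n),
    computed from a single length-n FFT (assumed reordering trick)."""
    n = len(x)
    # Even-indexed samples first, then odd-indexed samples in reverse order.
    v = list(x[0::2]) + list(x[1::2])[::-1]
    V = fft([complex(t) for t in v])
    # A quarter-sample phase shift turns the FFT output into cosine coefficients.
    return [(V[k] * cmath.exp(-1j * cmath.pi * k / (2 * n))).real
            for k in range(n)]


def dct2_direct(x):
    """O(n^2) reference implementation used to check the FFT-based version."""
    n = len(x)
    return [sum(x[m] * math.cos(math.pi * (m + 0.5) * k / n) for m in range(n))
            for k in range(n)]


if __name__ == "__main__":
    sample = [1.0, 2.0, 3.0, 4.0, 0.5, -1.0, 2.5, 0.0]
    fast = dct2_via_fft(sample)
    slow = dct2_direct(sample)
    print(max(abs(f - s) for f, s in zip(fast, slow)))
```

Because the cosine transform only reorders the input, calls the FFT once, and applies a pointwise phase factor, its cost is that of the FFT plus O(N) extra work. That is the sense in which the complexity of a higher-level block can be read off from the block beneath it.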
Similar resources
Computational aspects of a code to study rotating turbulent convection in spherical shells
The coupling of highly turbulent convection with rotation within a full spherical shell geometry, such as in the solar convection zone, can be studied with the new anelastic spherical harmonic (ASH) code developed to exploit massively parallel architectures. Inter-processor transposes are used to ensure data locality in spectral transforms, a sophisticated load balancing algorithm is implemente...
Hierarchical Parallelization of MLFMA for the Efficient Solution of Large-Scale Electromagnetics Problems
We present the details of a hierarchical partitioning strategy for the efficient parallelization of the multilevel fast multipole algorithm (MLFMA) on distributed-memory architectures. Unlike previous parallelization approaches, this strategy is based on the simultaneous distribution of clusters and their fields by considering the optimal partitioning of each level separately. Using the hierarch...
A Hierarchical Parallel Processing System for the Multipass-Rendering Method
The multipass-rendering method integrating radiosity with ray-tracing gives one of the best solutions for synthesizing photo-realistic images. However, the method is also computationally expensive. Therefore, parallel processing is the most promising approach to the fast multipass-rendering method. This paper presents a hierarchical parallel processing system for the multipass-rendering method....
Hierarchical Partitioning and Dynamic Load Balancing for Scientific Computation
Cluster and grid computing has made hierarchical and heterogeneous computing systems increasingly common as target environments for large-scale scientific computation. A cluster may consist of a network of multiprocessors. A grid computation may involve communication across slow interfaces. Modern supercomputers are often large clusters with hierarchical network structures. For maximum efficien...
A Prototypical Self-Optimizing Package for Parallel Implementation of Fast Signal Transforms
This paper presents a self-adapting parallel package for computing the Walsh-Hadamard transform (WHT), a prototypical fast signal transform, similar to the fast Fourier transform. Using a search over a space of mathematical formulas representing different algorithms to compute the WHT, the package finds the best parallel implementation on a given shared-memory multiprocessor. The search automat...